pacman::p_load(sf, tidyverse, tmap, dplyr,
raster, spatstat, spdep,
lubridate, leaflet,
plotly, DT, viridis,
ggplot2, sfdep)Take-Home Exercise 4 - Detailed version
The Task
In this take-home exercise, we are required to select one of the module of our proposed Shiny application and complete the following tasks:
To evaluate and determine the necessary R packages needed for our Shiny application are supported in R CRAN,
To prepare and test the specific R codes can be run and returns the correct output as expected,
To determine the parameters and outputs that will be exposed on the Shiny applications,
To select the appropriate Shiny UI components for exposing the parameters determine above, and
We are required to include a section called UI design for the different components of the UIs for the proposed design.
Getting Started
For this project on the visualisation of Armed conflicts in Myanmar, I will be preparing the codes and UI for the sections on EDA (Proportional Symbol maps), Cluster & Outlier Analysis (LISA) and Hot/Cold zone analysis.
Loading R packages
Importing the ACLED data
Country specific data from the Armed Conflict Location & Event Data Project (ACLED) can be downloaded at https://acleddata.com/data-export-tool/
ACLED_MMR <- read_csv("data/MMR.csv")class(ACLED_MMR)[1] "spec_tbl_df" "tbl_df" "tbl" "data.frame"
Downloading and loading the shape files for country
Shape files were downloaded from the Myanmmar Information Management Unit (MIMU) website at https://geonode.themimu.info/layers/?limit=100&offset=0
This source was chosen over GADM and GeoBoundaries due to its updated administrative region information and map levels.
ACLED captures event data from national, sub-national and other credible media sources, and populates event locations based on the last known information.
However, due to the dynamic nature of conflict and politics, country/administrative boundaries and borders can sometimes be fluid. Names of administrative areas were found to have changed; either disaggregated into new countries/administrative areas or previously active but now defunct. Further, some administrative areas were agglomerated and upgraded into higher tier administrative areas.
As part of our data cleaning and preparation process, I had to identify discrepancies in both admin1 and admin2 data levels and re-name some administrative areas to sync with the downloaded shape files from MIMU.
Data Preparation and Cleaning
Loading Admin1(administrative region/area) shape files
mmr_shp_mimu_1 <- st_read(dsn = "data/geospatial3",
layer = "mmr_polbnda2_adm1_250k_mimu_1")Reading layer `mmr_polbnda2_adm1_250k_mimu_1' from data source
`C:\imranmi\ISSS608-VAA\Take-home-ex\Take-home-Ex4\data\geospatial3'
using driver `ESRI Shapefile'
Simple feature collection with 18 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 92.1721 ymin: 9.696844 xmax: 101.17 ymax: 28.54554
Geodetic CRS: WGS 84
class(mmr_shp_mimu_1)[1] "sf" "data.frame"
The Shape file for admin1 level map, is an SF object, with geometry type: Multipolygon
st_geometry(mmr_shp_mimu_1)Geometry set for 18 features
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 92.1721 ymin: 9.696844 xmax: 101.17 ymax: 28.54554
Geodetic CRS: WGS 84
First 5 geometries:
unique_regions_mimu1 <- unique(mmr_shp_mimu_1$ST)
unique_regions_mimu1 [1] "Ayeyarwady" "Bago (East)" "Bago (West)" "Chin" "Kachin"
[6] "Kayah" "Kayin" "Magway" "Mandalay" "Mon"
[11] "Nay Pyi Taw" "Rakhine" "Sagaing" "Shan (East)" "Shan (North)"
[16] "Shan (South)" "Tanintharyi" "Yangon"
There are 18 admin1 levels or states/regions in mmr_shp_mimu_1
Lets compare with our admin1 levels in our main dataset ACLED_MMR
unique_acled_regions1 <- unique(ACLED_MMR$admin1)
unique_acled_regions1 [1] "Rakhine" "Bago-East" "Sagaing" "Shan-North" "Mandalay"
[6] "Mon" "Yangon" "Shan-South" "Kayin" "Kachin"
[11] "Magway" "Ayeyarwady" "Nay Pyi Taw" "Kayah" "Chin"
[16] "Bago-West" "Tanintharyi" "Shan-East"
I will write a simple function to identify the discrepancies between the shape file and the region names in our main dataset.
# Find the unique region names that are in 'unique_acled_regions1' but not in 'unique_regions_mimu1'
mismatched_admin1 <- setdiff(unique_acled_regions1, unique_regions_mimu1)
if (length(mismatched_admin1) > 0) {
print("The following region names from 'acled_mmr' do not match any in 'mimu1':")
print(mismatched_admin1)
} else {
print("All unique region names in 'acled_mmr' match the unique region names in 'mimu1.'")
}[1] "The following region names from 'acled_mmr' do not match any in 'mimu1':"
[1] "Bago-East" "Shan-North" "Shan-South" "Bago-West" "Shan-East"
Lets harmonize the names in both data files. I will resave it to a new data set called ACLED_MMR_1
ACLED_MMR_1 <- ACLED_MMR %>%
mutate(admin1 = case_when(
admin1 == "Bago-East" ~ "Bago (East)",
admin1 == "Bago-West" ~ "Bago (West)",
admin1 == "Shan-North" ~ "Shan (North)",
admin1 == "Shan-South" ~ "Shan (South)",
admin1 == "Shan-East" ~ "Shan (East)",
TRUE ~ as.character(admin1)
))Checking if our changes are successful.
# Get unique admin 1 region names from 'ACLED_MMR_1'
unique_acled_regions1 <- unique(ACLED_MMR_1$admin1)
# Get unique region names from 'mmr_shp_mimu_1'
unique_map_regions_mimu1 <- unique(mmr_shp_mimu_1$ST)
# Find the unique region names that are in 'unique_acled_regions1' but not in 'unique_map_regions_mimu1'
mismatched_regions <- setdiff(unique_acled_regions1, unique_map_regions_mimu1)
if (length(mismatched_regions) > 0) {
print("The following region names from 'acled_mmr_1' do not match any in 'mmr_shp_mimu_1':")
print(mismatched_regions)
} else {
print("All unique region names in 'acled_mmr_1' match the unique region names in 'mmmr_shp_mimu_1.'")
}[1] "All unique region names in 'acled_mmr_1' match the unique region names in 'mmmr_shp_mimu_1.'"
Lets do a sample plot to see how our country map looks like at admin1 level
plot(mmr_shp_mimu_1)
Loading Admin2 (administrative region/area) shape files
mmr_shp_mimu_2 <- st_read(dsn = "data/geospatial3",
layer = "mmr_polbnda_adm2_250k_mimu")Reading layer `mmr_polbnda_adm2_250k_mimu' from data source
`C:\imranmi\ISSS608-VAA\Take-home-ex\Take-home-Ex4\data\geospatial3'
using driver `ESRI Shapefile'
Simple feature collection with 80 features and 7 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 92.1721 ymin: 9.696844 xmax: 101.17 ymax: 28.54554
Geodetic CRS: WGS 84
class(mmr_shp_mimu_2)[1] "sf" "data.frame"
The Shape file for admin2 level map, is an SF object, with geometry type: Multipolygon
st_geometry(mmr_shp_mimu_2)Geometry set for 80 features
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 92.1721 ymin: 9.696844 xmax: 101.17 ymax: 28.54554
Geodetic CRS: WGS 84
First 5 geometries:
unique_regions_mimu2 <- unique(mmr_shp_mimu_2$DT)
unique_regions_mimu2 [1] "Hinthada" "Labutta"
[3] "Maubin" "Myaungmya"
[5] "Pathein" "Pyapon"
[7] "Bago" "Taungoo"
[9] "Pyay" "Thayarwady"
[11] "Falam" "Hakha"
[13] "Matupi" "Mindat"
[15] "Bhamo" "Mohnyin"
[17] "Myitkyina" "Puta-O"
[19] "Bawlake" "Loikaw"
[21] "Hpa-An" "Hpapun"
[23] "Kawkareik" "Myawaddy"
[25] "Gangaw" "Magway"
[27] "Minbu" "Pakokku"
[29] "Thayet" "Kyaukse"
[31] "Maungdaw" "Mrauk-U"
[33] "Sittwe" "Thandwe"
[35] "Hkamti" "Kale"
[37] "Kanbalu" "Katha"
[39] "Kawlin" "Mawlaik"
[41] "Monywa" "Naga Self-Administered Zone"
[43] "Sagaing" "Shwebo"
[45] "Tamu" "Yinmarbin"
[47] "Kengtung" "Monghsat"
[49] "Tachileik" "Hopang"
[51] "Kokang Self-Administered Zone" "Kyaukme"
[53] "Lashio" "Matman"
[55] "Mongmit" "Muse"
[57] "Pa Laung Self-Administered Zone" "Danu Self-Administered Zone"
[59] "Langkho" "Loilen"
[61] "Pa-O Self-Administered Zone" "Taunggyi"
[63] "Dawei" "Kawthoung"
[65] "Mandalay" "Meiktila"
[67] "Myingyan" "Nyaung-U"
[69] "Pyinoolwin" "Yamethin"
[71] "Mawlamyine" "Thaton"
[73] "Det Khi Na" "Oke Ta Ra"
[75] "Kyaukpyu" "Myeik"
[77] "Yangon (East)" "Yangon (North)"
[79] "Yangon (South)" "Yangon (West)"
There are 80 admin2 levels or states/districts in mmr_shp_mimu_2
Lets compare with our admin2 levels in our main dataset ACLED_MMR
unique_acled_regions2 <- unique(ACLED_MMR$admin2)
unique_acled_regions2 [1] "Maungdaw" "Bago"
[3] "Shwebo" "Kyaukme"
[5] "Pyinoolwin" "Muse"
[7] "Sittwe" "Yinmarbin"
[9] "Thaton" "Yangon-North"
[11] "Pa-O Self-Administered Zone" "Hpapun"
[13] "Kyaukpyu" "Yangon-West"
[15] "Mongmit" "Bhamo"
[17] "Mrauk-U" "Yangon-East"
[19] "Yangon-South" "Monywa"
[21] "Gangaw" "Pathein"
[23] "Katha" "Taungoo"
[25] "Kanbalu" "Lashio"
[27] "Mawlamyine" "Myitkyina"
[29] "Kawkareik" "Loilen"
[31] "Mandalay" "Kawlin"
[33] "Kyaukse" "Magway"
[35] "Meiktila" "Pakokku"
[37] "Taunggyi" "Tamu"
[39] "Nay Pyi Taw" "Mohnyin"
[41] "Kale" "Det Khi Na"
[43] "Myingyan" "Loikaw"
[45] "Matupi" "Pyay"
[47] "Sagaing" "Myeik"
[49] "Dawei" "Thayarwady"
[51] "Thandwe" "Mawlaik"
[53] "Bawlake" "Pyapon"
[55] "Hinthada" "Thayet"
[57] "Pa Laung Self-Administered Zone" "Mindat"
[59] "Hkamti" "Kokang Self-Administered Zone"
[61] "Hpa-An" "Danu Self-Administered Zone"
[63] "Myawaddy" "Maubin"
[65] "Hakha" "Falam"
[67] "Minbu" "Monghsat"
[69] "Puta-O" "Hopang"
[71] "Nyaung-U" "Kawthoung"
[73] "Yamethin" "Yangon"
[75] "Myaungmya" "Mong Pawk (Wa SAD)"
[77] "Oke Ta Ra" "Matman"
[79] "Kengtung" "Naga Self-Administered Zone"
[81] "Labutta" "Langkho"
[83] "Tachileik"
I will write a simple function to identify the discrepancies between the shape file and our state/district names in our main dataset.
# Find the unique region names that are in 'unique_acled_regions2' but not in 'unique_regions_mimu2'
mismatched_admin2 <- setdiff(unique_acled_regions2, unique_regions_mimu2)
if (length(mismatched_admin2) > 0) {
print("The following region names from 'acled_mmr' do not match any in 'mimu2':")
print(mismatched_admin2)
} else {
print("All unique region names in 'acled_mmr' match the unique region names in 'mimu2.'")
}[1] "The following region names from 'acled_mmr' do not match any in 'mimu2':"
[1] "Yangon-North" "Yangon-West" "Yangon-East"
[4] "Yangon-South" "Nay Pyi Taw" "Yangon"
[7] "Mong Pawk (Wa SAD)"
Lets harmonize the names in both data files. I will resave it to the previous data set called ACLED_MMR_1
ACLED_MMR_1 <- ACLED_MMR_1 %>%
mutate(admin2 = case_when(
admin2 == "Yangon-East" ~ "Yangon (East)",
admin2 == "Yangon-West" ~ "Yangon (West)",
admin2 == "Yangon-North" ~ "Yangon (North)",
admin2 == "Yangon-South" ~ "Yangon (South)",
admin2 == "Mong Pawk (Wa SAD)" ~ "Tachileik",
admin2 == "Nay Pyi Taw" ~ "Det Khi Na",
admin2 == "Yangon" ~ "Yangon (West)",
TRUE ~ as.character(admin2)
))Checking if our changes are successful.
# Get unique admin 2 district names from 'ACLED_MMR_1'
unique_acled_regions2 <- unique(ACLED_MMR_1$admin2)
# Get unique district names from 'mmr_shp_mimu_2'
unique_map_regions_mimu2 <- unique(mmr_shp_mimu_2$DT)
# Find the unique district names that are in 'unique_acled_regions2' but not in 'unique_map_regions_mimu2'
mismatched_regions2 <- setdiff(unique_acled_regions2, unique_map_regions_mimu2)
if (length(mismatched_regions2) > 0) {
print("The following district names from 'acled_mmr_1' do not match any in 'mmr_shp_mimu_2':")
print(mismatched_regions2)
} else {
print("All unique district names in 'acled_mmr_1' match the unique district names in 'mmmr_shp_mimu_2.'")
}[1] "All unique district names in 'acled_mmr_1' match the unique district names in 'mmmr_shp_mimu_2.'"
Lets do a sample plot to see how our country map looks like at admin2 (districts) level.
plot(mmr_shp_mimu_2)
Data Wrangling
For the purposes of plotting choropleth maps, I will first create attributes subsets for each admin level (1&2) grouped by year, admin region, event type and sub event type.
Data1 <- ACLED_MMR_1 %>%
group_by(year, admin1, event_type, sub_event_type) %>%
summarise(Incidents = n(),
Fatalities = sum(fatalities, na.rm = TRUE)) %>%
ungroup()
Data2 <- ACLED_MMR_1 %>%
group_by(year, admin2, event_type, sub_event_type) %>%
summarise(Incidents = n(),
Fatalities = sum(fatalities, na.rm = TRUE)) %>%
ungroup()Checking the total sum of incidents and fatalities
total_incidents1 <- sum(Data1$Incidents)
total_incidents2 <- sum(Data2$Incidents)
total_fatalities1 <- sum(Data1$Fatalities)
total_fatalities2 <- sum(Data2$Fatalities)total_incidents1[1] 57198
total_incidents2[1] 57198
total_fatalities1[1] 57593
total_fatalities2[1] 57593
Next, I will do a spatial join between my shape files and attribute files
ACLED_MMR_admin1 <- left_join(mmr_shp_mimu_1, Data1,
by = c("ST" = "admin1"))
ACLED_MMR_admin2 <- left_join(mmr_shp_mimu_2, Data2,
by = c("DT" = "admin2"))class(ACLED_MMR_admin1)[1] "sf" "data.frame"
class(ACLED_MMR_admin2)[1] "sf" "data.frame"
Checking that total sum of incidents and fatalities in our SF files are correct as per our original datasets
total_incidents3 <- sum(ACLED_MMR_admin1$Incidents)
total_incidents4 <- sum(ACLED_MMR_admin2$Incidents)
total_fatalities3 <- sum(ACLED_MMR_admin1$Fatalities)
total_fatalities4 <- sum(ACLED_MMR_admin2$Fatalities)
total_incidents3[1] 57198
total_incidents4[1] 57198
total_fatalities3[1] 57593
total_fatalities4[1] 57593
Choropleth Maps
Choropleth Map of Incidents & Fatalities by Admin1 (Region/State)
In the Shiny App, the below choropleth maps can be plotted, with users being able to choose the following:-
Variable: Count of Incidents or Fatalities
specific year or year range
event type and sub event type
data classification type, and
number of clusters (n)
ACLED_MMR_admin1 %>%
filter(year == 2022, event_type == "Battles") %>%
tm_shape() +
tm_fill("Fatalities",
n = 5,
style = "quantile",
palette = "Reds") +
tm_borders(alpha = 0.5)
ACLED_MMR_admin1 %>%
filter(year == 2022, event_type == "Violence against civilians") %>%
tm_shape() +
tm_fill("Fatalities",
n = 5,
style = "quantile",
palette = "Reds") +
tm_borders(alpha = 0.5)
ACLED_MMR_admin1 %>%
filter(year == 2021, event_type == "Riots") %>%
tm_shape() +
tm_fill("Incidents",
n = 5,
style = "jenks",
palette = "Reds") +
tm_borders(alpha = 0.5)
ACLED_MMR_admin1 %>%
filter(year == 2023, event_type == "Battles", sub_event_type == "Armed clash" ) %>%
tm_shape() +
tm_fill("Fatalities",
n = 5,
style = "quantile",
palette = "Reds") +
tm_borders(alpha = 0.5)
ACLED_MMR_admin1 %>%
filter(year == 2023, event_type == "Explosions/Remote violence", sub_event_type == "Shelling/artillery/missile attack" ) %>%
tm_shape() +
tm_fill("Fatalities",
n = 5,
style = "quantile",
palette = "Reds") +
tm_borders(alpha = 0.5)
Choropleth map of Incidents & Fatalities by Admin2 level (by District)
Similarly, in the Shiny App, the below choropleth maps can be plotted, with users being able to choose the following:-
Variable: Count of Incidents or Fatalities
specific year or year range
event type and sub event type
data classification type, and
number of clusters (n)
ACLED_MMR_admin2 %>%
filter(year == 2023, event_type == "Battles") %>%
tm_shape() +
tm_fill("Fatalities",
n = 5,
style = "quantile",
palette = "Reds") +
tm_borders(alpha = 0.5)
ACLED_MMR_admin2 %>%
filter(year == 2021, event_type == "Violence against civilians", sub_event_type == "Attack" ) %>%
tm_shape() +
tm_fill("Incidents",
n = 5,
style = "quantile",
palette = "Reds") +
tm_borders(alpha = 0.5)
ACLED_MMR_admin2 %>%
filter(year == 2023, event_type == "Explosions/Remote violence", sub_event_type == "Air/drone strike" ) %>%
tm_shape() +
tm_fill("Incidents",
n = 5,
style = "quantile",
palette = "Reds") +
tm_borders(alpha = 0.5)
Incidents and Fatalities by individual regions using tm_facet()
ACLED_MMR_admin1 %>%
filter(year == 2023, event_type == "Battles", sub_event_type == "Armed clash" ) %>%
tm_shape( ) +
tm_fill("Fatalities",
style = "quantile",
palette = "Reds",
thres.poly = 0) +
tm_facets(by="ST",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = FALSE,
title.position = c("center", "center"),
title.size = 20) +
tm_borders(alpha = 0.5)
ACLED_MMR_admin1 %>%
filter(year == 2022, event_type == "Violence against civilians") %>%
tm_shape( ) +
tm_fill("Incidents",
style = "quantile",
palette = "Reds",
thres.poly = 0) +
tm_facets(by="ST",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = FALSE,
title.position = c("center", "center"),
title.size = 20) +
tm_borders(alpha = 0.5)
The above plots using tm_facet() may not necessarily add value to users, so I will KIV for the app.
Proportional Symbol Maps
Proportional symbol maps (also known as graduate symbol maps) are a class of maps that use the visual variable of size to represent differences in the magnitude of a discrete, abruptly changing phenomenon, e.g. counts of people.
First I will convert the ACLED_MMR_1 data set to become an sf object
# Convert conflict data to an sf object
conflict_sf <- st_as_sf(ACLED_MMR_1, coords = c("longitude", "latitude"), crs = 4326)class(conflict_sf)[1] "sf" "tbl_df" "tbl" "data.frame"
conflict_sfSimple feature collection with 57198 features and 33 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 92.199 ymin: 9.9824 xmax: 100.3576 ymax: 27.731
Geodetic CRS: WGS 84
# A tibble: 57,198 × 34
event_id_cnty event_date year time_precision disorder_type event_type
* <chr> <chr> <dbl> <dbl> <chr> <chr>
1 MMR58558 16 February 2024 2024 1 Political vio… Battles
2 MMR58559 16 February 2024 2024 1 Political vio… Battles
3 MMR58443 16 February 2024 2024 2 Political vio… Violence …
4 MMR58502 16 February 2024 2024 1 Political vio… Violence …
5 MMR58507 16 February 2024 2024 1 Political vio… Explosion…
6 MMR58508 16 February 2024 2024 1 Political vio… Explosion…
7 MMR58547 16 February 2024 2024 1 Strategic dev… Strategic…
8 MMR58560 16 February 2024 2024 1 Political vio… Battles
9 MMR58589 16 February 2024 2024 2 Political vio… Violence …
10 MMR58648 16 February 2024 2024 1 Political vio… Explosion…
# ℹ 57,188 more rows
# ℹ 28 more variables: sub_event_type <chr>, actor1 <chr>, assoc_actor_1 <chr>,
# inter1 <dbl>, actor2 <chr>, assoc_actor_2 <chr>, inter2 <dbl>,
# interaction <dbl>, civilian_targeting <chr>, iso <dbl>, region <chr>,
# country <chr>, admin1 <chr>, admin2 <chr>, admin3 <chr>, location <chr>,
# geo_precision <dbl>, source <chr>, source_scale <chr>, notes <chr>,
# fatalities <dbl>, tags <chr>, timestamp <dbl>, population_1km <dbl>, …
Next, I create subsets for each event type, each subset will inherit the SF object characteristic.
Battles <- filter(conflict_sf, event_type == "Battles")
Violence_CV <- filter(conflict_sf, event_type == "Violence against civilians")
Protests <- filter(conflict_sf, event_type == "Protests")
Riots <- filter(conflict_sf, event_type == "Riots")
Explosions <- filter(conflict_sf, event_type == "Explosions/Remote violence")
Strategic_dev <- filter(conflict_sf, event_type == "Strategic developments")class(Battles)[1] "sf" "tbl_df" "tbl" "data.frame"
Visualising of Fatalities by Event Type
In the shiny App , we can enable users to filter by
specific year,
year range, and
event_type
range of number of fatalities
By using subsets of event types which have been converted to sf objects.
I will use the Geometry points from our data sets to plot the points of events in the maps using the leaflet package.
In this case, i will use admin 1 level regions, to achieve a better aesthetics for users. ie dividing the country map into more districts would likely look too busy, and not value add to users. Further, additional information on the specific event type, region, year, no of fatalities and actors involved can be communicated by means of the tooltip.
scaleFactor <- 2
leaflet(Battles) %>%
addProviderTiles(providers$CartoDB.Positron) %>%
addPolygons(data = mmr_shp_mimu_1, color = "#444444", weight = 1, fillOpacity = 0.5) %>% # Adding borders
addCircleMarkers(popup = ~paste("Event: Battles<br>State/Region:", admin1,
"<br>Actor1:", actor1, "<br>Actor2:", actor2,
"<br>Year:", year, "<br>Fatalities:", fatalities),
radius = ~sqrt(fatalities) * scaleFactor,
fillColor = "red", fillOpacity = 0.4, color = "#FFFFFF", weight = 1) %>%
setView(lng = 96.1603, lat = 19.745, zoom = 6)scaleFactor <- 2
leaflet(Violence_CV) %>%
addProviderTiles(providers$CartoDB.Positron) %>%
addPolygons(data = mmr_shp_mimu_1, color = "#444444", weight = 1, fillOpacity = 0.5) %>% # Adding borders
addCircleMarkers(popup = ~paste("Event: Violence on Civillians<br>State/Region:", admin1,
"<br>Actor1:", actor1, "<br>Actor2:", actor2,
"<br>Year:", year, "<br>Fatalities:", fatalities),
radius = ~sqrt(fatalities) * scaleFactor,
fillColor = "red", fillOpacity = 0.4, color = "#FFFFFF", weight = 1) %>%
setView(lng = 96.1603, lat = 19.745, zoom = 6)scaleFactor <- 2
leaflet(Protests) %>%
addProviderTiles(providers$CartoDB.Positron) %>%
addPolygons(data = mmr_shp_mimu_1, color = "#444444", weight = 1, fillOpacity = 0.5) %>% # Adding borders
addCircleMarkers(popup = ~paste("Event: Protests<br>State/Region:", admin1,
"<br>Actor1:", actor1, "<br>Actor2:", actor2,
"<br>Year:", year, "<br>Fatalities:", fatalities),
radius = ~sqrt(fatalities) * scaleFactor,
fillColor = "red", fillOpacity = 0.4, color = "#FFFFFF", weight = 1) %>%
setView(lng = 96.1603, lat = 19.745, zoom = 6)scaleFactor <- 2
leaflet(Riots) %>%
addProviderTiles(providers$CartoDB.Positron) %>%
addPolygons(data = mmr_shp_mimu_1, color = "#444444", weight = 1, fillOpacity = 0.5) %>% # Adding borders
addCircleMarkers(popup = ~paste("Event: Riots<br>State/Region:", admin1,
"<br>Actor1:", actor1, "<br>Actor2:", actor2,
"<br>Year:", year, "<br>Fatalities:", fatalities),
radius = ~sqrt(fatalities) * scaleFactor,
fillColor = "red", fillOpacity = 0.4, color = "#FFFFFF", weight = 1) %>%
setView(lng = 96.1603, lat = 19.745, zoom = 6)scaleFactor <- 2
leaflet(Explosions) %>%
addProviderTiles(providers$CartoDB.Positron) %>%
addPolygons(data = mmr_shp_mimu_1, color = "#444444", weight = 1, fillOpacity = 0.5) %>% # Adding borders
addCircleMarkers(popup = ~paste("Event: Explosions<br>State/Region:", admin1,
"<br>Actor1:", actor1, "<br>Actor2:", actor2,
"<br>Year:", year, "<br>Fatalities:", fatalities),
radius = ~sqrt(fatalities) * scaleFactor,
fillColor = "red", fillOpacity = 0.4, color = "#FFFFFF", weight = 1) %>%
setView(lng = 96.1603, lat = 19.745, zoom = 6)Data Preparation for Spatial Analysis
First I will create subsets of our Events happening in admin region 1 & 2, summarized with the number(count) of incidents and fatalities.
Events1 <- ACLED_MMR_1 %>%
group_by(year, admin1, event_type) %>%
summarise(Incidents = n(),
Fatalities = sum(fatalities, na.rm = TRUE)) %>%
ungroup()
Events2 <- ACLED_MMR_1 %>%
group_by(year, admin2, event_type) %>%
summarise(Incidents = n(),
Fatalities = sum(fatalities, na.rm = TRUE)) %>%
ungroup()Next, we will perform a relational join to update our admin 1 and admin 2 level shape files with attributes fields of the above event related data.
Events_admin1 <- left_join(mmr_shp_mimu_1, Events1,
by = c("ST" = "admin1"))
Events_admin2 <- left_join(mmr_shp_mimu_2, Events2,
by = c("DT" = "admin2"))The class of the file is SF objects file
class(Events_admin1)[1] "sf" "data.frame"
st_geometry(Events_admin2)Geometry set for 2748 features
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 92.1721 ymin: 9.696844 xmax: 101.17 ymax: 28.54554
Geodetic CRS: WGS 84
First 5 geometries:
Next, I will create a subset of the event type and year.
For example, for the puposes of this exercise, the codes below will be used to analyse for Event type = Battles, in the year 2023
I will name the object as Battles_2023, eventually this object file be named generically (eg: Event_Year), so that users can select for different event types and different years in the Shiny app.
Filtering the Event and Year (Event type = Battles, in 2023)
Battles_2023 <- Events_admin2 %>%
filter(year == 2023, event_type == "Battles")Computing Contiguity Spatial Weights
Before we can compute any spatial statistics, we need to construct spatial weights of the study area.
The spatial weights is used to define the neighbourhood relationships between the geographical units (i.e. admin2) in the study area (Myanmar).
In the code below, poly2nb() of spdep package is used to compute contiguity weight matrices for the study area. This function builds a neighbours list based on regions with contiguous boundaries.
By default this function will return a list of first order neighbours using the Queen criteria.
However, we can pass a “queen” argument that takes TRUE or FALSE as options.
wm_q <- poly2nb(Battles_2023,
queen=TRUE)
summary(wm_q)Neighbour list object:
Number of regions: 74
Number of nonzero links: 356
Percentage nonzero weights: 6.501096
Average number of links: 4.810811
Link number distribution:
1 2 3 4 5 6 7 8 10
3 8 11 11 15 9 9 6 2
3 least connected regions:
2 16 58 with 1 link
2 most connected regions:
19 56 with 10 links
The summary report above shows that there are 74 area units for this subset (Battles occurring in 2023).
There are 2 most connected area units with 10 neighbours, and there are 3 area units with only 1 neighbours.
Row-standardised weights matrix
Next, we assign weights to each neighboring polygon. In our case, each neighboring polygon will be assigned equal weight (style=“W”). This is accomplished by assigning the fraction 1/(#ofneighbors) to each neighboring admin2 (district) and then summing the weighted income values.
This has one drawback in that polygons along the edges of the study area will base their lagged values on fewer polygons and thus potentially over- or under-estimating the true nature of the spatial autocorrelation in the data.
For this example, I will stick with the style=“W” option for simplicity’s sake but note that other more robust options are available, notably style=“B”.
rswm_q <- nb2listw(wm_q,
style="W",
zero.policy = TRUE)
rswm_qCharacteristics of weights list object:
Neighbour list object:
Number of regions: 74
Number of nonzero links: 356
Percentage nonzero weights: 6.501096
Average number of links: 4.810811
Weights style: W
Weights constants summary:
n nn S0 S1 S2
W 74 5476 74 36.81702 310.0455
While GLOBAL Moran’s I score and the Geary’s C ratio can tell us whether specific event types (Battles, Explosions, Protests, Riots, Violence against civilians) tends to cluster or not on the map, it does not provide any information on the distribution of spatial dependence of Events types, and is unable to identify the location of hotspots and clusters.
For that, we require the use of more localized methods - Anselin’s Moran Scatterplot and the Local Indicator of Spatial Autocorrelation (LISA) method.
Cluster and Outlier Analysis
Local Indicators of Spatial Association or LISA are statistics that evaluate the existence of clusters in the spatial arrangement of a given variable. For example, in this analysis, we are studying if there are areas that have higher or lower incidents of a specific Event type (Battles) than is to be expected by chance alone, ie the values occurring are above or below those of a random distribution in space.
Computing local Moran’s I
To compute local Moran’s I, the localmoran() function of spdep will be used. It computes Ii values, given a set of zi values and a listw object providing neighbour weighting information for the polygon associated with the zi values.
fips <- order(Battles_2023$DT)
localMI <- localmoran(Battles_2023$Incidents, rswm_q)
head(localMI) Ii E.Ii Var.Ii Z.Ii Pr(z != E(Ii))
1 0.46274887 -9.006432e-03 0.1247560902 1.3356292 0.1816705
2 0.71317847 -1.016877e-02 0.7448368248 0.8381394 0.4019524
3 0.69218038 -9.386041e-03 0.3392457987 1.2045132 0.2283913
4 0.68518101 -9.386041e-03 0.3392457987 1.1924960 0.2330668
5 0.00627746 -1.483579e-05 0.0001702656 0.4822206 0.6296493
6 -0.23004674 -4.814215e-03 0.0464274459 -1.0453066 0.2958813
localmoran() function returns a matrix of values whose columns are:
Ii: the local Moran’s I statistics
E.Ii: the expectation of local moran statistic under the randomisation hypothesis
Var.Ii: the variance of local moran statistic under the randomisation hypothesis
Z.Ii:the standard deviate of local moran statistic
Pr(): the p-value of local moran statistic
The code below lists the content of the local Moran matrix derived by using printCoefmat().
printCoefmat(data.frame(
localMI[fips,],
row.names=Battles_2023$DT[fips]),
check.names=FALSE) Ii E.Ii Var.Ii Z.Ii
Bago 6.2775e-03 -1.4836e-05 1.7027e-04 4.8222e-01
Bawlake -1.7909e-02 -4.6941e-04 1.1252e-02 -1.6441e-01
Bhamo 2.4621e-01 -1.7367e-03 1.9897e-02 1.7577e+00
Danu Self-Administered Zone 3.3657e-01 -8.2707e-03 1.9670e-01 7.7752e-01
Dawei 9.4559e-01 -1.4130e-02 5.0825e-01 1.3462e+00
Det Khi Na 2.3024e-01 -6.5686e-03 6.3234e-02 9.4170e-01
Falam -7.5203e-03 -2.4381e-03 5.8326e-02 -2.1044e-02
Gangaw 6.2609e-01 -6.9289e-03 6.6679e-02 2.4514e+00
Hakha -1.1347e-02 -6.1003e-05 1.0815e-03 -3.4318e-01
Hinthada 4.6275e-01 -9.0064e-03 1.2476e-01 1.3356e+00
Hkamti -5.3716e-02 -3.0598e-03 3.5009e-02 -2.7074e-01
Hopang -2.3647e-01 -9.7735e-03 3.5311e-01 -3.8150e-01
Hpa-An -1.3231e-01 -3.0598e-03 1.9751e-02 -9.1968e-01
Hpapun -4.1969e-02 -4.2526e-03 5.9189e-02 -1.5503e-01
Kale 1.2532e-01 -7.7384e-04 6.4571e-03 1.5692e+00
Kanbalu -3.2905e-02 -3.4002e-05 3.9022e-04 -1.6640e+00
Katha -8.8073e-03 -1.8682e-02 1.5309e-01 2.5238e-02
Kawkareik 2.9062e-02 -1.3663e-02 3.2318e-01 7.5156e-02
Kawlin -7.2541e-02 -1.4063e-03 3.3678e-02 -3.8762e-01
Kawthoung -6.7102e-01 -3.2826e-03 2.4212e-01 -1.3570e+00
Kokang Self-Administered Zone 9.6449e-04 -1.1447e-08 2.7452e-07 1.8408e+00
Kyaukme -6.9939e-02 -9.0471e-03 8.6877e-02 -2.0659e-01
Kyaukpyu 4.7922e-01 -6.8933e-03 1.2137e-01 1.3953e+00
Kyaukse -1.4472e-01 -9.0064e-03 7.4533e-02 -4.9713e-01
Labutta 7.1318e-01 -1.0169e-02 7.4484e-01 8.3814e-01
Langkho -9.6769e-03 -8.6347e-03 2.0528e-01 -2.3004e-03
Lashio 2.0219e-01 -4.2805e-03 4.8917e-02 9.3352e-01
Loikaw -4.7652e-01 -2.3869e-02 3.2568e-01 -7.9318e-01
Loilen -2.8804e-02 -5.6413e-03 7.8408e-02 -8.2718e-02
Magway 9.4077e-02 -5.9426e-03 4.9330e-02 4.5033e-01
Mandalay -2.2955e-01 -3.0598e-03 7.3153e-02 -8.3742e-01
Matupi -1.7986e-02 -3.2117e-04 4.4878e-03 -2.6370e-01
Maungdaw 1.2575e-01 -2.6375e-03 6.3084e-02 5.1117e-01
Mawlaik -3.2557e-02 -2.6375e-03 3.0190e-02 -1.7220e-01
Mawlamyine 2.4766e-01 -3.3072e-03 5.8440e-02 1.0381e+00
Meiktila 3.8976e-01 -8.2707e-03 7.9484e-02 1.4118e+00
Minbu -1.9321e-01 -6.2516e-03 6.0203e-02 -7.6195e-01
Mindat 2.2211e-02 -8.7518e-04 1.5503e-02 1.8541e-01
Mohnyin 1.1002e-01 -8.8788e-04 1.5727e-02 8.8440e-01
Mongmit -4.9751e-01 -8.6347e-03 1.1965e-01 -1.4133e+00
Monywa 3.7540e+00 -3.3925e-02 5.8106e-01 4.9692e+00
Mrauk-U 2.1574e-01 -3.9983e-03 4.5705e-02 1.0278e+00
Muse 2.6578e-01 -1.6313e-01 2.4204e+00 2.7569e-01
Myawaddy 1.8035e-02 -6.4391e-05 2.3492e-03 3.7341e-01
Myeik 3.6057e-01 -2.5739e-02 9.1496e-01 4.0386e-01
Myingyan 4.3732e-02 -1.0008e-04 1.3987e-03 1.1720e+00
Myitkyina -1.9413e-01 -6.2855e-03 8.7306e-02 -6.3572e-01
Naga Self-Administered Zone -8.8212e-02 -1.0169e-02 3.6725e-01 -1.2878e-01
Nyaung-U -5.4115e-01 -8.6347e-03 1.5176e-01 -1.3669e+00
Oke Ta Ra 5.7519e-01 -9.0064e-03 1.5824e-01 1.4686e+00
Pa-O Self-Administered Zone 1.9941e-01 -5.6413e-03 5.4359e-02 8.7950e-01
Pa Laung Self-Administered Zone -5.3309e-01 -5.0623e-03 7.0402e-02 -1.9901e+00
Pakokku 1.9341e+00 -2.2766e-01 1.4683e+00 1.7840e+00
Pathein 6.9218e-01 -9.3860e-03 3.3925e-01 1.2045e+00
Puta-O -4.8052e-01 -6.8933e-03 5.0659e-01 -6.6543e-01
Pyapon 6.8518e-01 -9.3860e-03 3.3925e-01 1.1925e+00
Pyay 1.0545e-01 -1.2618e-03 1.7615e-02 8.0407e-01
Pyinoolwin 4.6151e-01 -2.7025e-02 2.1958e-01 1.0426e+00
Sagaing 9.5824e-01 -1.0212e-02 9.7948e-02 3.0944e+00
Shwebo 2.1412e+00 -5.0075e-02 5.4593e-01 2.9657e+00
Sittwe 2.3135e-01 -3.0598e-03 1.1130e-01 7.0266e-01
Tamu 3.2174e-02 -1.8902e-04 3.3506e-03 5.5911e-01
Taunggyi -1.0168e-01 -1.1395e-03 7.3697e-03 -1.1711e+00
Taungoo -2.3005e-01 -4.8142e-03 4.6427e-02 -1.0453e+00
Thandwe 5.6456e-01 -1.0169e-02 1.4069e-01 1.5322e+00
Thaton -6.7767e-02 -3.0835e-03 5.4499e-02 -2.7708e-01
Thayarwady 9.0973e-03 -1.4836e-05 2.0737e-04 6.3277e-01
Thayet 2.9530e-01 -5.3479e-03 5.1546e-02 1.3242e+00
Yamethin 4.3920e-01 -9.7735e-03 1.3528e-01 1.2207e+00
Yangon (East) 6.1100e-01 -7.5664e-03 1.8008e-01 1.4576e+00
Yangon (North) 4.4954e-01 -9.3860e-03 1.0671e-01 1.4049e+00
Yangon (South) 5.2023e-01 -8.6347e-03 1.1965e-01 1.5289e+00
Yangon (West) 6.6585e-01 -9.7735e-03 2.3209e-01 1.4024e+00
Yinmarbin 4.5788e+00 -9.9115e-02 1.2481e+00 4.1872e+00
Pr.z....E.Ii..
Bago 0.6296
Bawlake 0.8694
Bhamo 0.0788
Danu Self-Administered Zone 0.4369
Dawei 0.1782
Det Khi Na 0.3463
Falam 0.9832
Gangaw 0.0142
Hakha 0.7315
Hinthada 0.1817
Hkamti 0.7866
Hopang 0.7028
Hpa-An 0.3577
Hpapun 0.8768
Kale 0.1166
Kanbalu 0.0961
Katha 0.9799
Kawkareik 0.9401
Kawlin 0.6983
Kawthoung 0.1748
Kokang Self-Administered Zone 0.0656
Kyaukme 0.8363
Kyaukpyu 0.1629
Kyaukse 0.6191
Labutta 0.4020
Langkho 0.9982
Lashio 0.3506
Loikaw 0.4277
Loilen 0.9341
Magway 0.6525
Mandalay 0.4024
Matupi 0.7920
Maungdaw 0.6092
Mawlaik 0.8633
Mawlamyine 0.2992
Meiktila 0.1580
Minbu 0.4461
Mindat 0.8529
Mohnyin 0.3765
Mongmit 0.1576
Monywa 0.0000
Mrauk-U 0.3040
Muse 0.7828
Myawaddy 0.7088
Myeik 0.6863
Myingyan 0.2412
Myitkyina 0.5250
Naga Self-Administered Zone 0.8975
Nyaung-U 0.1716
Oke Ta Ra 0.1419
Pa-O Self-Administered Zone 0.3791
Pa Laung Self-Administered Zone 0.0466
Pakokku 0.0744
Pathein 0.2284
Puta-O 0.5058
Pyapon 0.2331
Pyay 0.4214
Pyinoolwin 0.2972
Sagaing 0.0020
Shwebo 0.0030
Sittwe 0.4823
Tamu 0.5761
Taunggyi 0.2415
Taungoo 0.2959
Thandwe 0.1255
Thaton 0.7817
Thayarwady 0.5269
Thayet 0.1854
Yamethin 0.2222
Yangon (East) 0.1449
Yangon (North) 0.1601
Yangon (South) 0.1263
Yangon (West) 0.1608
Yinmarbin 0.0000
Mapping the local Moran’s I
Before mapping the local Moran’s I map, we will need to append the local Moran’s I dataframe (i.e. localMI) onto the Battles_2023’s SF DataFrame.
Battles_2023.localMI <- cbind(Battles_2023,localMI) %>%
rename(Pr.Ii = Pr.z....E.Ii..)Battles_2023.localMISimple feature collection with 74 features and 16 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 92.1721 ymin: 9.696844 xmax: 99.66532 ymax: 28.54554
Geodetic CRS: WGS 84
First 10 features:
OBJECTID ST ST_PCODE DT DT_PCODE DT_MMR PCode_V year
1 1 Ayeyarwady MMR017 Hinthada MMR017D002 ဟင်္သာတခရိုင် 9.4 2023
2 2 Ayeyarwady MMR017 Labutta MMR017D004 လပွတ္တာခရိုင် 9.4 2023
3 5 Ayeyarwady MMR017 Pathein MMR017D001 ပုသိမ်ခရိုင် 9.4 2023
4 6 Ayeyarwady MMR017 Pyapon MMR017D006 ဖျာပုံခရိုင် 9.4 2023
5 7 Bago (East) MMR007 Bago MMR007D001 ပဲခူးခရိုင် 9.4 2023
6 8 Bago (East) MMR007 Taungoo MMR007D002 တောင်ငူခရိုင် 9.4 2023
7 9 Bago (West) MMR008 Pyay MMR008D001 ပြည်ခရိုင် 9.4 2023
8 10 Bago (West) MMR008 Thayarwady MMR008D002 သာယာဝတီခရိုင် 9.4 2023
9 11 Chin MMR004 Falam MMR004D001 ဖလမ်းခရိုင် 9.4 2023
10 12 Chin MMR004 Hakha MMR004D003 ဟားခါးခရိုင် 9.4 2023
event_type Incidents Fatalities Ii E.Ii Var.Ii
1 Battles 4 3 0.462748867 -9.006432e-03 0.1247560902
2 Battles 1 1 0.713178468 -1.016877e-02 0.7448368248
3 Battles 3 1 0.692180375 -9.386041e-03 0.3392457987
4 Battles 3 2 0.685181011 -9.386041e-03 0.3392457987
5 Battles 50 270 0.006277460 -1.483579e-05 0.0001702656
6 Battles 87 493 -0.230046740 -4.814215e-03 0.0464274459
7 Battles 34 59 0.105454317 -1.261774e-03 0.0176145487
8 Battles 50 93 0.009097304 -1.483579e-05 0.0002073682
9 Battles 27 89 -0.007520292 -2.438086e-03 0.0583263538
10 Battles 48 210 -0.011346560 -6.100301e-05 0.0010814666
Z.Ii Pr.Ii geometry
1 1.33562921 0.1816705 MULTIPOLYGON (((95.12637 18...
2 0.83813940 0.4019524 MULTIPOLYGON (((95.04462 15...
3 1.20451317 0.2283913 MULTIPOLYGON (((94.27572 15...
4 1.19249602 0.2330668 MULTIPOLYGON (((95.20798 15...
5 0.48222056 0.6296493 MULTIPOLYGON (((95.90674 18...
6 -1.04530664 0.2958813 MULTIPOLYGON (((96.17964 19...
7 0.80407054 0.4213562 MULTIPOLYGON (((95.70458 19...
8 0.63277490 0.5268807 MULTIPOLYGON (((95.85173 18...
9 -0.02104359 0.9832109 MULTIPOLYGON (((93.36931 24...
10 -0.34317564 0.7314663 MULTIPOLYGON (((93.35213 23...
Mapping local Moran’s I values
Using choropleth mapping functions of tmap package, we can plot the local Moran’s I values by using the code below.
tm_shape(Battles_2023.localMI) +
tm_fill(col = "Ii",
style = "pretty",
palette = "RdBu",
title = "local moran statistics") +
tm_borders(alpha = 0.5)
Mapping local Moran’s I p-values
The choropleth shows there is evidence for both positive and negative Ii values. However, we will also need to consider the p-values for each of these values.
The code below produces a choropleth map of Moran’s I p-values by using functions of tmap package.
tm_shape(Battles_2023.localMI) +
tm_fill(col = "Pr.Ii",
breaks=c(-Inf, 0.001, 0.01, 0.05, 0.1, Inf),
palette="-Blues",
title = "local Moran's I p-values") +
tm_borders(alpha = 0.5)
Mapping both local Moran’s I values and p-values
For the Shiny App, it will be better to plot both the local Moran’s I values map and its corresponding p-values map next to each other.
The code below will be used to create such visualisation.
localMI.map <- tm_shape(Battles_2023.localMI) +
tm_fill(col = "Ii",
style = "pretty",
title = "local moran statistics") +
tm_borders(alpha = 0.5)
pvalue.map <- tm_shape(Battles_2023.localMI) +
tm_fill(col = "Pr.Ii",
breaks=c(-Inf, 0.001, 0.01, 0.05, 0.1, Inf),
palette="-Blues",
title = "local Moran's I p-values") +
tm_borders(alpha = 0.5)
tmap_arrange(localMI.map, pvalue.map, asp=1, ncol=2)
Creating a LISA Cluster Map
The LISA Cluster Map shows the significant locations color coded by type of spatial autocorrelation. The first step before we can generate the LISA cluster map is to plot the Moran scatterplot.
Plotting Moran scatterplot
The Moran scatterplot is an illustration of the relationship between the values of the chosen attribute at each location and the average value of the same attribute at neighboring locations.
The code below plots the Moran scatterplot of Battles in 2023 by using moran.plot() of spdep.
nci <- moran.plot(Battles_2023$Incidents, rswm_q,
labels=as.character(Battles_2023$DT),
xlab="Battles_2023",
ylab="Spatially Lagged Events,Year")
The plot is split in 4 quadrants. The top right corner belongs to areas that have high incidents of events and are surrounded by other areas that have higher than the average level/number of battles This is the high-high locations.
The Moran scatterplot is divided into four areas, with each quadrant corresponding with one of four categories: (1) High-High (HH) in the top-right quadrant; (2) High-Low (HL) in the bottom right quadrant; (3) Low-High (LH) in the top-left quadrant; (4) Low- Low (LL) in the bottom left quadrant.
Plotting Moran scatterplot with standardised variable
First I will use scale() to centre and scale the variable. Here centering is done by subtracting the mean (omitting NAs) the corresponding columns, and scaling is done by dividing the (centred) variable by their standard deviations.
Battles_2023$Z.Incidents <- scale(Battles_2023$Incidents) %>%
as.vector The as.vector() added to the end is to make sure that the data type we get out of this is a vector, that maps neatly into our dataframe.
Next, we plot the Moran scatterplot again by using the code below.
nci2 <- moran.plot(Battles_2023$Z.Incidents, rswm_q,
labels=as.character(Battles_2023$DT),
xlab="z-Battles in 2023",
ylab="Spatially Lag z-Battles in 2023")
Moran Scatterplots from 2020-2023 , Battles
2020

2021

2022

2023

Both types of scatter plots can be implemented for the Shiny app, with users being able to select between standardised or non-standardised variables.
High-High (HH): indicates high spatial correlation where incidents of Battles are clustered closely together. 2) High-Low (HL): where areas of high frequency of incidents of Battles occurred are located next to areas where there is low frequency of incidents of Battles occurred. 3) Low-High (LH): these are areas of low frequency of incidents where Battles occurred that are located next to areas where high frequency of Battles. 4) Low-Low (LL): these are clusters of low frequency of incidents of Battles occurred.
Preparing LISA map classes
The Moran Scatterplot has one drawback - it does not indicate whether these regions are significant or not. We can address this limitation by using the LISA method. LISA will not only allow us to identify the hotspot locations, but also the statistical significance of the hot spots in the map.
According to Anselin (1995), LISA can be used to locate “hot spots” or local spatial clusters where the occurrence of Event types is statistically significant.
In addition to the four categories described in the Moran Scatterplot, the LISA analysis includes an additional category: (5) Insignificant: where there are no spatial autocorrelation or clusters where event types have occurred.
The code below shows the steps to prepare a LISA cluster map.
quadrant <- vector(mode="numeric",length=nrow(localMI))Next, we derive the spatially lagged variable of interest (i.e. Incidents) and centers the spatially lagged variable around its mean.
Battles_2023$lag_Incidents <- lag.listw(rswm_q, Battles_2023$Incidents)
DV <- Battles_2023$lag_Incidents - mean(Battles_2023$lag_Incidents) This is followed by centering the local Moran’s around the mean.
LM_I <- localMI[,1] - mean(localMI[,1]) Next, we will set a statistical significance level for the local Moran.
signif <- 0.05 These four command lines define the low-low (1), low-high (2), high-low (3) and high-high (4) categories.
quadrant[DV <0 & LM_I>0] <- 1
quadrant[DV >0 & LM_I<0] <- 2
quadrant[DV <0 & LM_I<0] <- 3
quadrant[DV >0 & LM_I>0] <- 4 Lastly, we place non-significant Moran in the category 0.
quadrant[localMI[,5]>signif] <- 0We can also combine all the steps above into one single code chunk as shown below:
quadrant <- vector(mode="numeric",length=nrow(localMI))
Battles_2023$lag_Incidents <- lag.listw(rswm_q, Battles_2023$Incidents)
DV <- Battles_2023$lag_Incidents - mean(Battles_2023$lag_Incidents)
LM_I <- localMI[,1]
signif <- 0.05
quadrant[DV <0 & LM_I>0] <- 1
quadrant[DV >0 & LM_I<0] <- 2
quadrant[DV <0 & LM_I<0] <- 3
quadrant[DV >0 & LM_I>0] <- 4
quadrant[localMI[,5]>signif] <- 0Plotting LISA Map
Now, we can build the LISA map by using the code below.
Battles_2023.localMI$quadrant <- quadrant
colors <- c("#ffffff", "#2c7bb6", "#abd9e9", "#fdae61", "#d7191c")
clusters <- c("insignificant", "low-low", "low-high", "high-low", "high-high")
tm_shape(Battles_2023.localMI) +
tm_fill(col = "quadrant",
style = "cat",
palette = colors[c(sort(unique(quadrant)))+1],
labels = clusters[c(sort(unique(quadrant)))+1],
popup.vars = c("")) +
tm_view(set.zoom.limits = c(11,17)) +
tm_borders(alpha=0.5)
We can also plot the choropleth map of incidents of Battles in 2023 together with the LISA map
Incidents <- qtm(Battles_2023, "Incidents")
Battles_2023.localMI$quadrant <- quadrant
colors <- c("#ffffff", "#2c7bb6", "#abd9e9", "#fdae61", "#d7191c")
clusters <- c("insignificant", "low-low", "low-high", "high-low", "high-high")
LISAmap <- tm_shape(Battles_2023.localMI) +
tm_fill(col = "quadrant",
style = "cat",
palette = colors[c(sort(unique(quadrant)))+1],
labels = clusters[c(sort(unique(quadrant)))+1],
popup.vars = c("")) +
tm_view(set.zoom.limits = c(11,17)) +
tm_borders(alpha=0.5)
tmap_arrange(Incidents, LISAmap,
asp=1, ncol=2)
We can also include the local Moran’s I map and p-value map as shown below for easy comparison.
tmap_arrange(localMI.map, pvalue.map, asp=1, ncol=2)
To see the full effect of the LISA maps by event type, we will need to view this over time to see how it has changed over time.
LISA Maps for Battles in 2020-2023
2020

2021

2022

2023

For our Shiny App, I will combine these 4 plots into a single page for users.
Hot Spot and Cold Spot Area Analysis
Beside detecting for clusters and outliers, Localised spatial statistics can be also used to detect hot spot and/or cold spot areas.
The term ‘hot spot’ has been used generically across disciplines to describe a region or value that is higher relative to its surroundings (Lepers et al 2005, Aben et al 2012, Isobe et al 2015).
Getis and Ord’s G-Statistics
An alternative spatial statistics to detect spatial anomalies is the Getis and Ord’s G-statistics (Getis and Ord, 1972; Ord and Getis, 1995).
It looks at neighbours within a defined proximity to identify where either high or low values cluster spatially.
Here, statistically significant hot-spots are recognised as areas of high values where other areas within a neighbourhood range also share high values too.
The analysis consists of three steps:
Deriving spatial weight matrix
Computing Gi statistics
Mapping Gi statistics
Deriving distance-based weight matrix
First, we need to define a new set of neighbours. Whist the spatial autocorrelation considered units which shared borders, for Getis-Ord we are defining neighbours based on distance.
There are two type of distance-based proximity matrix, they are:
fixed distance weight matrix; and
adaptive distance weight matrix.
Deriving the centroid
We will need points to associate with each polygon before we can make our connectivity graph.
We need the coordinates in a separate data frame for this to work. To do this we will use a mapping function. The mapping function applies a given function to each element of a vector and returns a vector of the same length.
Our input vector will be the geometry column of us.bound. Our function will be st_centroid(). We will be using map_dbl variation of map from the purrr package.
To get our longitude values we map the st_centroid() function over the geometry column of us.bound and access the longitude value through double bracket notation [[]] and 1. This allows us to get only the longitude, which is the first value in each centroid.
longitude <- map_dbl(Battles_2023$geometry, ~st_centroid(.x)[[1]])We do the same for latitude with one key difference. We access the second value per each centroid with [[2]].
latitude <- map_dbl(Battles_2023$geometry, ~st_centroid(.x)[[2]])Now that we have latitude and longitude, we use cbind to put longitude and latitude into the same object.
coords <- cbind(longitude, latitude)Determine the cut-off distance
Firstly, we need to determine the upper limit for distance band by using the steps below:
Return a matrix with the indices of points belonging to the set of the k nearest neighbours of each other by using knearneigh() of spdep.
Convert the knn object returned by knearneigh() into a neighbours list of class nb with a list of integer vectors containing neighbour region number ids by using knn2nb().
Return the length of neighbour relationship edges by using nbdists() of spdep. The function returns in the units of the coordinates if the coordinates are projected, in km otherwise.
Remove the list structure of the returned object by using unlist().
k1 <- knn2nb(knearneigh(coords))
k1dists <- unlist(nbdists(k1, coords, longlat = TRUE))
summary(k1dists) Min. 1st Qu. Median Mean 3rd Qu. Max.
14.26 49.33 66.03 71.79 82.19 196.85
The summary report shows that the largest first nearest neighbour distance is 196.85 km, so using this as the upper threshold gives certainty that all units will have at least one neighbour.
wm_d62 <- dnearneigh(coords, 0, 62, longlat = TRUE)
wm_d62Neighbour list object:
Number of regions: 74
Number of nonzero links: 44
Percentage nonzero weights: 0.8035062
Average number of links: 0.5945946
46 regions with no links:
1 3 5 6 7 8 9 10 11 12 13 14 15 16 17 18 22 23 24 25 26 27 29 32 33 34
35 36 37 40 45 46 47 48 49 50 51 53 54 57 58 62 63 66 69 70
54 disjoint connected subgraphs
#wm62_lw <- nb2listw(wm_d62, style = 'B')
#summary(wm62_lw)#there is a problem with using fixed distance weight matrix, as our subset of Battles 2023, has some empty neighbours, as not all districts have battles in 2023.

Computing adaptive distance weight matrix
One of the characteristics of fixed distance weight matrix is that more densely settled areas (usually the urban areas) tend to have more neighbours and the less densely settled areas (usually the rural counties) tend to have lesser neighbours.
Having many neighbours smoothes the neighbour relationship across more neighbours.
It is possible to control the numbers of neighbours directly using k-nearest neighbours, either accepting asymmetric neighbours or imposing symmetry as shown in the code below.
knn <- knn2nb(knearneigh(coords, k=8))
knnNeighbour list object:
Number of regions: 74
Number of nonzero links: 592
Percentage nonzero weights: 10.81081
Average number of links: 8
Non-symmetric neighbours list
Next, nb2listw() is used to convert the nb object into spatial weights object.
knn_lw <- nb2listw(knn, style = 'B')
summary(knn_lw)Characteristics of weights list object:
Neighbour list object:
Number of regions: 74
Number of nonzero links: 592
Percentage nonzero weights: 10.81081
Average number of links: 8
Non-symmetric neighbours list
Link number distribution:
8
74
74 least connected regions:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 with 8 links
74 most connected regions:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 with 8 links
Weights style: B
Weights constants summary:
n nn S0 S1 S2
B 74 5476 592 1036 19636
Computing Gi statistics
Gi statistics using adaptive distance
The code below is used to compute the Gi values for Incidents of Battles in 2023 by using an adaptive distance weight matrix (i.e knb_lw).
fips <- order(Battles_2023$DT)
gi.adaptive <- localG(Battles_2023$Incidents, knn_lw)
Battles_2023.gi <- cbind(Battles_2023, as.matrix(gi.adaptive)) %>%
rename(gstat_adaptive = as.matrix.gi.adaptive.)Mapping Gi values with adaptive distance weights
Now we visualise the locations of hot spot and cold spot areas. The choropleth mapping functions of tmap package will be used to map the Gi values.
The code below shows the functions used to map the Gi values derived using fixed distance weight matrix.
Events_Years<- qtm(Battles_2023, "Incidents")
Gimap <- tm_shape(Battles_2023.gi) +
tm_fill(col = "gstat_adaptive",
style = "pretty",
palette="-RdBu",
title = "local Gi") +
tm_borders(alpha = 0.5)
tmap_arrange(Events_Years,
Gimap,
asp=1,
ncol=2)
To see the full effect of the HOT & Cold spots by event type, we will need to view this over time to see how hot and cold spots have changed over time.
Hot & Cold Spots for Battles in 2020-2023
2020



2023

References
Main reference: Kam, T.S. (2024). Local Measures of Spatial Autocorrelation